The Consensus Problem in Unreliable Distributed Systems

(A Brief Survey) †

Michael  J.  Fischer 
Yale  University 
New  Haven,  Connecticut 

Abstract 

Agreement  problems  involve  a  system  of  processes,  some  of  which  may  be  faulty.  A  fundamental 
problem  of  fault-tolerant  distributed  computing  is  for  the  reliable  processes  to  reach  a  consensus. 

We  survey  the  considerable  literature  on  this  problem  that  has  developed  over  the  past  few  years 
and  give  an  informal  overview  of  the  major  theoretical  results  in  the  area. 

1.  Agreement  Problems 

To  achieve  reliability  in  distributed  systems,  protocols  are  needed  which  enable  the  system  as 
a  whole  to  continue  to  function  despite  the  failure  of  a  limited  number  of  components.  These 
protocols, as well as many other distributed computing problems, require cooperation among the
processes.  Fundamental  to  such  cooperation  is  the  problem  of  agreeing  on  a  piece  of  data  upon 
which  the  computation  depends.  For  example,  the  data  managers  in  a  distributed  database 
system  need  to  agree  on  whether  to  commit  or  abort  a  given  transaction  [20,  26].  In  a  replicated 
file system, the nodes might need to agree on where the file copies are supposed to reside [19, 30].
In  a  flight  control  system  for  an  airplane  [35],  the  engine  control  module  and  the  flight  surface 
control  module  need  to  agree  on  whether  to  continue  or  abort  a  landing  in  progress.  The  key 
point  here  is  not  what  the  processes  are  agreeing  on  but  the  fact  that  they  must  all  come  to  the 
same  conclusion. 

An  obvious  approach  to  achieving  agreement  is  for  the  processes  to  vote  and  agree  on  the 
majority  value.  In  the  absence  of  faults,  this  works  fine,  but  in  a  close  election,  the  vote  of  one 
faulty  process  can  swing  the  outcome.  Since  distinct  reliable  processes  might  receive  conflicting 
votes  from  a  faulty  process,  they  might  also  reach  conflicting  conclusions  about  the  outcome  of 
the  election  and  hence  fail  to  reach  agreement.  Davies  and  Wakerly  [2]  realised  this  difficulty 
and  proposed  a  multistage  voting  scheme  to  overcome  the  problem. 
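
The difficulty with single-stage voting can be made concrete with a small sketch. It assumes four reliable processes (the hypothetical names A to D) split evenly between 0 and 1, plus one faulty process that tells each receiver whatever that receiver already prefers. The function names and the tie-breaking rule are illustrative choices, not taken from [2].

```python
from collections import Counter

def majority(votes):
    """Return 1 if strictly more 1-votes than 0-votes were seen, else 0."""
    counts = Counter(votes)
    return 1 if counts[1] > counts[0] else 0

def tally_per_receiver(reliable_votes, faulty_vote_seen_by):
    """Each reliable receiver tallies all reliable votes plus the vote
    that the faulty process chose to send to that particular receiver."""
    outcomes = {}
    for receiver, faulty_vote in faulty_vote_seen_by.items():
        ballots = list(reliable_votes.values()) + [faulty_vote]
        outcomes[receiver] = majority(ballots)
    return outcomes

# A close election among the reliable processes.
reliable_votes = {"A": 0, "B": 0, "C": 1, "D": 1}
# The faulty process sends conflicting votes to different receivers.
faulty_vote_seen_by = {"A": 0, "B": 0, "C": 1, "D": 1}

outcomes = tally_per_receiver(reliable_votes, faulty_vote_seen_by)
print(outcomes)
```

Every reliable process follows the voting rule faithfully, yet A and B conclude that 0 won while C and D conclude that 1 won.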

†This work was supported in part by the Office of Naval Research under Contract N00014-82-K-0154, and by the
National Science Foundation under Grant MCS-8118078.

A simple form of the problem is to achieve consensus on a single bit. Assume a fixed number
of processors, some of which are initially faulty or may fail during the execution of the protocol.
Each processor i has an initial bit x_i. The consensus problem is for the non-faulty processes to
agree on a bit y, called the consensus value. More precisely, we want a protocol such that each
reliable process i eventually terminates with a bit y_i, and y_i = y for all i.

In general, y will depend in some way on the initial bits x_i. In the absence of such a
requirement, the problem becomes trivial, for each process can simply choose y_i = 0. Some
dependency  requirements  that  have  been  studied,  in  order  of  increasing  strength,  are: 

1. (non-triviality): For each y ∈ {0, 1}, there is some initial vector x and some
admissible execution of the protocol in which y is the consensus value. (The
qualification “admissible” allows for additional restrictions, such as bounds on the
number of faulty processes, on the kinds of computations we are willing to consider.)

2. (weak unanimity): If x_i = x ∈ {0, 1} for all i, then y = x, provided that no failures
actually occur during the execution of the protocol.

3. (strong unanimity): If x_i = x ∈ {0, 1} for all i, then y = x.
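
The difference in strength between the two unanimity conditions can be illustrated by checking a single finished execution against each of them. The sketch below is only a checker for one recorded run, under the assumption that the run is summarized by the initial bits, the decisions of the reliable processes, and the set of reliable processes. All function and parameter names are hypothetical.

```python
def agreement(decisions, reliable):
    """All reliable processes terminated with the same bit."""
    return len({decisions[i] for i in reliable}) <= 1

def strong_unanimity(initial, decisions, reliable):
    """If every initial bit equals x, every reliable process decided x."""
    values = set(initial.values())
    if len(values) != 1:
        return True                     # the condition is vacuous
    (x,) = values
    return all(decisions[i] == x for i in reliable)

def weak_unanimity(initial, decisions, reliable):
    """Same requirement, but only for executions without any failure."""
    if set(reliable) != set(initial):
        return True                     # some process failed: no constraint
    return strong_unanimity(initial, decisions, reliable)

# All three processes start with 1, process 2 fails, the others decide 0.
initial = {0: 1, 1: 1, 2: 1}
reliable = {0, 1}
decisions = {0: 0, 1: 0}

print(agreement(decisions, reliable))                  # True
print(weak_unanimity(initial, decisions, reliable))    # True
print(strong_unanimity(initial, decisions, reliable))  # False
```

The run is acceptable under weak unanimity because a failure occurred, but it violates strong unanimity.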

Two  other  closely  related  problems  have  been  studied  extensively  in  the  literature.  The 
interactive  consistency  problem  is  like  the  consensus  problem  except  that  the  goal  of  the  protocol 
is  for  the  non-faulty  processes  to  agree  on  a  vector  y,  called  the  consensus  vector.  Again,  we  add 
dependency  requirements: 

1. (weak): for each j, y_j = x_j if j is non-faulty, provided that no failures actually occur
during the execution of the protocol.

2. (strong): for each j, y_j = x_j if j is non-faulty.

Finally,  in  the  generals  problem  or  reliable  broadcast  problem,  one  assumes  a  distinguished 
processor  (the  “general”  or  “transmitter”)  which  is  trying  to  send  its  initial  bit  x  to  all  the  others. 
As  before,  all  the  reliable  processes  have  to  reach  consensus  on  a  bit,  and  we  add  dependency 
requirements: 

1.  (weak):  y  =  x  if  no  failures  occur  during  the  execution  of  the  protocol. 

2. (strong): y = x if the general is non-faulty.

Without  further  qualification,  any  reference  to  one  of  these  problems  will  refer  to  the  version 
with  the  strong  dependency  requirement. 


2.  Models  of  Computation 

The kinds of solutions that can be obtained to agreement problems depend heavily on the
assumptions made about the model of computation and the kinds of faults to which it is prone.
Throughout this paper, we will assume a fixed number n of processes. A protocol is said to be
t-resilient if it operates correctly as long as no more than t processes fail before or during
execution.

We  consider  two  kinds  of  processor  faults.  A  crash  occurs  when  a  process  stops  all  activity. 
Up  to  the  point  of  the  crash,  it  operates  correctly  and  after  that  it  is  completely  inactive.  A 
protocol that can tolerate up to t crashed processes is said to be t-crash resilient. We do not
concern ourselves with the problem of repairing a faulty process and reintegrating it into the
system, although that of course is a crucial problem in the practical implementation of any of
these ideas [28].

A more disruptive kind of failure is the so-called Byzantine failure,¹ in which no assumptions
are made about the behavior of a faulty process. In particular, it can send messages when it is
not supposed to, make conflicting claims to other processes, act dead for a while and then revive
itself, etc. A protocol that can tolerate up to t processes which exhibit Byzantine failures is said
to be t-Byzantine resilient and is sometimes called a Byzantine protocol. The problem of
finding a t-Byzantine resilient protocol for the (weak) generals problem is called the (weak)
Byzantine generals problem.

To  show  that  a  protocol  is  Byzantine  resilient,  one  has  to  consider  all  possible  faulty 
behaviors,  including  those  in  which  the  failed  processes  act  maliciously  against  the  protocol.  This 
doesn’t  mean  that  Byzantine  protocols  are  only  appropriate  in  adversary  situations.  The  folklore 
is  full  of  stories  in  which  systems  failed  in  bizarre  and  unexpected  ways,  and  in  the  absence  of 
good  ways  of  characterizing  the  kinds  of  failures  that  occur  in  practice,  protecting  against 
Byzantine  failures  is  a  conservative  approach  to  reliable  systems  design. 

We  assume  the  message  system  to  be  completely  reliable  and  that  only  processes  are  subject 
to  failure.  We  also  assume  that  any  process  can  reliably  determine  the  sender  of  any  message  it 
receives,  and  any  message  so  delivered  arrives  intact  and  without  errors.  Unless  stated  otherwise, 
we  assume  the  network  is  a  completely  connected  graph. 


¹The terminology comes from [25], in which a fable is recounted concerning a problem of military communication
in times of old.

Of course, in real systems, communication links as well as processors are subject to failure.
However, a link failure can be identified with the failure of one of the processors at its two ends,
so a t-resilient protocol automatically tolerates up to t process and link failures. Nevertheless,
this may give an overly pessimistic view of the reliability of the system. Reischuk [32] greatly
refines  the  fault  assumptions,  enabling  him  to  obtain  more  informative  results  on  the  actual 
behaviors  of  the  systems. 

A  crucial  assumption  concerns  whether  or  not  the  failure  of  a  process  to  send  an  expected 
message  can  be  detected.  If  so,  then  the  expectant  receiver  gains  the  valuable  knowledge  that  the 
sender  is  faulty.  In  a  model  with  accurate  clocks  and  bounds  on  message  transit  times,  such 
detection is possible through the use of timeouts. (Cf. [21].) Also, detection is automatic in a
synchronous model in which the processes run in lock step and messages sent at one step are
received at the next. However, detection is impossible in a fully asynchronous model in which no
assumptions  are  made  about  relative  step  times  or  message  delays,  for  there  is  no  way  to  tell 
whether  the  sender  has  failed  or  is  just  running  very  slowly.  This  turns  out  to  have  a  profound 
effect  on  the  solvability  of  agreement  problems. 

We  use  the  terms  synchronous  and  asynchronous  to  distinguish  between  these  two  extreme 
cases,  while  remaining  fully  cognizant  of  the  fact  that  synchronous  message  behavior  can  be 
achieved  in  systems  with  weaker  assumptions  than  full  synchrony.  For  our  purposes,  we  will 
assume  that  a  synchronous  computation  proceeds  in  a  sequence  of  rounds.  In  each  round,  every 
process  first  sends  as  many  messages  as  it  wishes  to  other  processes,  and  then  it  receives  the 
messages  sent  to  it  by  other  processes.  Thus,  messages  received  during  a  round  cannot  affect  the 
messages  sent  during  the  same  round. 
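
The round structure just described can be sketched as a small driver. This is an illustrative model only: `run_round`, the `send` and `receive` callbacks, and the example processes are hypothetical names introduced here, and failures are not modeled. The point it shows is that every outbox is computed from the state at the start of the round, before any message of that round is delivered.

```python
def run_round(states, send, receive):
    """Execute one synchronous round.

    send(p, state) returns {destination: message}.
    receive(p, state, inbox) returns the new state of p, where inbox maps
    each sender to the message it sent to p in this round.
    """
    # Phase 1: every process chooses its messages from its old state.
    outboxes = {p: send(p, states[p]) for p in states}
    # Phase 2: messages are delivered, then every process updates its state.
    inboxes = {p: {} for p in states}
    for sender, outbox in outboxes.items():
        for dest, message in outbox.items():
            inboxes[dest][sender] = message
    return {p: receive(p, states[p], inboxes[p]) for p in states}

PROCESSES = [0, 1, 2]

def broadcast_value(p, state):
    """Send the current state to every other process."""
    return {q: state for q in PROCESSES if q != p}

def keep_maximum(p, state, inbox):
    """Adopt the largest value seen in this round."""
    return max([state, *inbox.values()])

after_one_round = run_round({0: 3, 1: 7, 2: 5}, broadcast_value, keep_maximum)
print(after_one_round)
```

Because sending precedes receiving, no process can react within a round to something it learned in that same round.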

One  further  significant  assumption  is  whether  or  not  the  model  supports  signatures.  We 
assume  that  the  author  of  a  signed  message  can  be  reliably  determined  by  anyone  holding  the 
message,  regardless  of  where  the  message  came  from  and  regardless  of  anything  that  the  faulty 
processes  might  have  done.  In  other  words,  signatures  cannot  be  forged  by  faulty  processes,  so  if 
C  receives  a  message  from  B  signed  by  A,  then  C  knows  that  A  really  sent  the  message  and  that 
it  was  not  fabricated  by  B.  Signatures,  too,  have  a  profound  effect  on  the  solvability  of 
agreement  problems.  We  sometimes  use  authenticated  to  refer  to  a  protocol  using  signed 
messages. 

Digital signatures can be implemented using cryptographic techniques [3, 4, 27, 33], or if one
is willing to assume that faulty processes are not malevolent, simple signature schemes which are
not cryptographically secure can be used instead. All that we require is that it be unlikely for a
faulty process to generate a valid signature of some other process. Note that no special
techniques are needed to implement signatures if only crashes (and not Byzantine failures) are
considered, for then no incorrect messages are ever sent.

The  practicality  of  agreement  protocols  depends  heavily  on  their  computational  complexity. 
Some  factors  that  might  be  important  are  the  amount  of  time  needed  to  complete  the  protocol, 
the  amount  of  message  traffic  generated,  or  the  amount  of  memory  needed  by  the  participants. 
All  of  these  quantities  are  in  general  dependent  on  which  faults  actually  occur  and  when.  A 
reasonable  assumption  in  many  situations  is  that  faults  happen  rarely,  so  it  is  acceptable  to  spend 
considerable  resources  handling  them,  but  one  wants  the  normal  case  to  be  handled  quite 
efficiently.  Note  however  that  in  a  very  large  system,  the  probability  of  at  least  one  fault  is  high, 
and  the  expected  number  of  faults  grows  linearly  with  the  size  of  the  system. 
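
As a rough quantitative illustration of the last remark, suppose (an assumption made here only for the sake of the example) that each of the n processes fails independently with probability p during a run of the protocol. Then:

```latex
\[
\Pr[\text{at least one fault}] = 1 - (1 - p)^n,
\qquad
\mathbb{E}[\text{number of faults}] = np .
\]
```

For instance, with p = 0.001 and n = 1000, the probability of at least one fault is 1 − 0.999^1000 ≈ 0.63, and the expected number of faults is 1.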

We  measure  time  in  terms  of  the  number  of  rounds  of  message  exchange  that  take  place. 
Thus,  we  assume  every  process  can  potentially  exchange  messages  with  every  other  in  a  single 
unit  of  time.  Just  how  realistic  this  notion  of  time  is  depends  highly  on  the  structure  of  the 
message  system  and  on  the  reasonableness  of  the  assumption  that  a  process  can  really  send  or 
receive  n  messages  in  a  single  time  step.  We  measure  message  traffic  variously  as  the  total 
number  of  messages  sent,  the  total  number  of  bits  in  those  messages,  or  the  number  of  signatures 
(in  the  case  of  an  authenticated  protocol). 

3.  Relations  Among  Agreement  Problems 

The  three  agreement  problems  are  closely  related.  The  generals  problem  is  a  special  case  of 
the  interactive  consistency  problem  in  which  only  one  process's  initial  value  is  of  interest,  so  a 
protocol  achieving  interactive  consistency  also  solves  the  generals  problem.  Conversely,  n  copies 
of  a  protocol  for  the  generals  problem  can  be  run  in  parallel  to  solve  the  interactive  consistency 
problem. 

The  consensus  problem  appears  to  be  slightly  weaker  than  the  other  two.  An  interactive 
consistency  algorithm  can  be  modified  to  solve  the  consensus  problem  by  just  having  each  process 
choose  as  its  consensus  value  the  majority  value  in  the  consensus  vector.  This  works  as  long  as 
fewer  than  1/2  of  the  processes  are  faulty. 
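
A minimal sketch of this modification follows. It assumes the interactive consistency protocol has already run, so that every reliable process holds the same consensus vector. The function name and the rule that ties yield 0 are illustrative choices.

```python
def consensus_from_vector(vector):
    """Choose the majority bit of the agreed consensus vector (ties give 0)."""
    ones = sum(vector)
    return 1 if 2 * ones > len(vector) else 0

# All reliable processes hold the same vector, so they compute the same bit.
print(consensus_from_vector([1, 1, 0]))        # 1
print(consensus_from_vector([0, 0, 1, 0, 1]))  # 0
```

Agreement holds because the function is deterministic and its input is identical at all reliable processes. Strong unanimity holds because, when all initial bits equal x and fewer than half of the processes are faulty, more than half of the vector entries are the true initial bits of reliable processes and hence equal x.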

Using  a  consensus  algorithm  to  solve  either  of  the  other  two  problems,  however,  seems  to 
require an additional round of information exchange. For example, the generals problem can be
solved as follows:

Algorithm I

1.  The  general  sends  its  value  to  each  of  the  other  processes. 

2.  All  of  the  processes  together  run  a  consensus  algorithm  using  as  initial  values  the 
bits  received  from  the  general  at  the  first  step.  (The  general  of  course  uses  its  own 
bit.) 

This  solves  the  generals  problem  since  if  the  general  is  reliable,  then  all  of  the  processes 
receive  the  same  value  in  step  1.  By  the  strong  unanimity  condition,  this  value  will  be  chosen  as 
the  consensus  value.  In  any  case,  agreement  is  reached.  The  extra  cost  is  one  additional  round 
of  1-bit  messages  in  step  1.  Thus,  we  have  proved: 

Theorem 1. Given a t-resilient solution to the consensus problem, there is a t-resilient
solution to the generals problem which uses one more “round” of message exchange and
sends n−1 additional messages of 1 bit each.
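
The two steps of Algorithm I can be sketched with the consensus protocol treated as a black box. In the sketch below, `majority_bit` is only a stand-in for that black box: it satisfies strong unanimity in a failure-free simulation but is not itself a fault-tolerant consensus protocol. All names are hypothetical.

```python
def majority_bit(bits):
    """Placeholder for a real consensus protocol (assumption of this sketch)."""
    return 1 if 2 * sum(bits) > len(bits) else 0

def algorithm_one(bits_sent_by_general, consensus=majority_bit):
    """bits_sent_by_general[i] is the bit process i holds after step 1
    (the general's own entry is its own bit)."""
    # Step 1: the bit received from the general becomes the initial bit x_i.
    initial_bits = list(bits_sent_by_general)
    # Step 2: all processes run the consensus protocol on those initial bits.
    return consensus(initial_bits)

# A reliable general sends the same bit to everyone, so unanimity fixes the result.
print(algorithm_one([1, 1, 1, 1]))   # 1
# A faulty general may send conflicting bits; the processes still agree on one bit.
print(algorithm_one([0, 1, 0, 1]))
```

Whatever the general does in step 1, the agreement property of the underlying consensus protocol guarantees a common outcome in step 2.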

Many solutions to the generals problem have the general structure of Algorithm I and thus
appear to have embedded within them solutions to the consensus problem, seemingly obviating
the need for Algorithm I and the extra round of messages. However, the embedded consensus
algorithm does not necessarily solve the full consensus problem, for the case in which the general
is reliable yet the x_i's are not all the same can never arise when the x_i's are obtained from the
general on the first step.

Similar  remarks  apply  to  the  corresponding  weak  versions  of  these  problems.  In  fact,  a  weak 
Byzantine generals algorithm solves the weak consensus problem directly (without first using it
to  solve  the  interactive  consistency  problem),  for  if  all  the  initial  values  are  the  same  and  no 
process  is  faulty,  then  it  suffices  to  simply  agree  on  the  general’s  value.  There  is  not,  however, 
any  readily  apparent  way  to  use  a  solution  to  any  of  the  weak  versions  of  the  agreement  problem 
to  solve  any  of  the  strong  ones.  In  fact,  for  a  slightly  different  “approximate”  agreement 
problem,  Lamport  [22]  shows  that  the  weak  version  has  a  solution  whereas  the  strong  one  does 
not. 


4.  Solvability  of  Agreement  Problems 

Perhaps  the  most  basic  question  to  ask  of  a  proposed  agreement  problem  is  whether  or  not  it 
has a solution at all. By the previous discussion and Theorem 1, the consensus problem and the
interactive  consistency  problem  have  t-resilient  solutions  iff  the  generals  problem  does,  so  we  will 
restrict  attention  to  the  latter  problem  in  this  section. 

Consider  first  the  synchronous  case.  With  signatures,  Pease,  Shostak,  and  Lamport  [25,  29] 
give  a  t-resilient  solution  for  any  t. 

Theorem 2. There is a t-resilient authenticated synchronous protocol which solves the
strong (weak) Byzantine generals problem.

Briefly,  the  protocol  consists  of  t+1  rounds.  In  the  first  round,  the  general  sends  a  signed 
message  with  its  value  to  each  other  process.  At  each  round  thereafter,  each  process  adds  its 
signature  to  each  valid  message  received  from  the  previous  round  and  sends  it  to  all  processes 
whose  signature  does  not  already  appear  on  the  message.  A  message  received  during  round  k  is 
valid if it bears exactly k distinct signatures, the first of which is the general’s. Let V_i be the set
of values contained in all the valid messages received by i through the end of round t+1. If V_i is
a singleton, then that value is chosen as the consensus value. Otherwise, a fixed constant NIL is
chosen.

To prove agreement, we argue that if i and j are both reliable, then V_i = V_j. There are two
cases to consider. If the general is reliable, then both V_i and V_j consist solely of the general’s
value, since no other value ever appears in a valid message. Otherwise, let v be any value in V_i
and consider the message M from which i first learned of v. M consists of v followed by a list of
distinct signatures m_1, ..., m_k, the first of which is the general’s, and k ≤ t+1. If k < t+1 and
process j does not already know about v, then j learns of v from i on the next round. If k = t+1,
then m_1, ..., m_t are all faulty or else i would have learned of v earlier. But then m_{t+1} is
reliable, so j learns of v at the same round as i. Hence v is in V_j, and by symmetry V_i = V_j.
Correctness of the protocol easily follows.
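
The protocol can be simulated in a few lines under simplifying assumptions that are not part of the source: a signature chain is modeled as a tuple of process ids (so unforgeability holds by construction), process 0 is the general, a reliable general counts its own value as known, and faulty processes other than the general simply stay silent. The names `signed_generals` and `round_one` are hypothetical.

```python
NIL = "NIL"

def signed_generals(n, t, faulty, round_one):
    """round_one[i] is the value the general signs and sends to process i."""
    general = 0
    seen = {i: set() for i in range(n)}              # the sets V_i
    if general not in faulty:
        seen[general] = set(round_one.values())      # a reliable general sent one value
    # Messages delivered in the current round: receiver -> [(value, signers)].
    inbox = {i: [] for i in range(n)}
    for i, value in round_one.items():
        inbox[i].append((value, (general,)))
    for k in range(1, t + 2):                        # rounds 1 .. t+1
        outbox = {i: [] for i in range(n)}
        for i in range(n):
            for value, chain in inbox[i]:
                # Valid in round k: exactly k distinct signatures, general's first.
                if len(chain) != k or len(set(chain)) != k or chain[0] != general:
                    continue
                seen[i].add(value)
                if i in faulty or k == t + 1:
                    continue                         # silent fault, or last round
                for j in range(n):
                    if j != i and j not in chain:    # only to processes that have not signed
                        outbox[j].append((value, chain + (i,)))
        inbox = outbox
    return {i: (next(iter(seen[i])) if len(seen[i]) == 1 else NIL)
            for i in range(n) if i not in faulty}

# A faulty general sends 0 to process 1 and 1 to processes 2 and 3.
print(signed_generals(4, 1, {0}, {1: 0, 2: 1, 3: 1}))
# A reliable general sends 1 to everyone; process 3 is faulty.
print(signed_generals(4, 1, {3}, {1: 1, 2: 1, 3: 1}))
```

In the first run every reliable process ends with V_i = {0, 1} and therefore chooses NIL. In the second run every reliable process chooses the general's value 1.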

Without  signatures,  there  is  a  solution  if  and  only  if  the  fraction  of  faulty  processes  is  not  too 
large. 

Theorem  3.  There  is  a  t-resilient  synchronous  protocol  without  authentication  which 
solves  the  strong  (weak)  Byzantine  generals  problem  iff  t/n  <  1/3. 

The impossibility argument for t/n ≥ 1/3 appears in [25, 29] for the strong case and in [22]
for  the  weak  case  of  the  problem.  Protocols  demonstrating  the  solvability  of  both  problems  for 
t/n  <  1/3  appear  in  [25,  29].  Various  protocols  have  since  appeared  with  additional  desirable 
properties,  some  of  which  will  be  discussed  later  in  this  paper. 

In  the  fully  asynchronous  case,  there  is  no  solution.  In  fact,  Fischer,  Lynch,  and  Paterson  [18] 
show  that  the  problem  remains  unsolvable  even  with  much  weaker  requirements: 

Theorem  4.  In  a  fully  asynchronous  environment,  there  is  no  1-crash  resilient  solution 
to  the  consensus  problem,  even  when  only  the  non-triviality  condition  is  required. 

The  proof  is  by  contradiction.  In  general  outline,  one  assumes  the  existence  of  such  a 
protocol. The protocol is committed to the eventual consensus value at a certain point in time if
thereafter only the one value is a possible outcome, no matter how processes are scheduled or
how messages are delivered. One shows that at least for some initial configuration, the outcome
is not already committed. Starting from there, one constructs an infinite computation such that
the system forever stays uncommitted, contradicting the assumed correctness of the protocol.
The details get somewhat involved since it is necessary to ensure that the infinite computation
results from a “fair” schedule. The interested reader is referred to [18].

Returning  to  the  Byzantine  generals  problem  in  a  synchronous  environment,  we  consider 
weaker  connectivity  assumptions  on  the  network  which  nonetheless  permit  a  solution.  With 
signatures, Lamport et al. [25] show that the Byzantine generals problem can be solved in any
network in which the reliable processes are connected. Without signatures, they show that a
solution is possible in a 3t-“regular” graph. Dolev [5, 6] extends this latter result to completely
characterize the networks in which the problem is solvable:

Theorem 5. Consider a synchronous network with connectivity k having n processors, t
of  which  may  be  faulty.  Then  the  Byzantine  generals  problem  is  solvable  without 
authentication  iff  t/n  <  1/3  and  t/k  <  1/2. 
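
The two conditions of Theorem 5 are easy to evaluate for a concrete network. The helper below is a direct transcription of the inequalities into integer arithmetic. Its name is hypothetical, and the connectivity k must be supplied by the caller (for a completely connected network of n processors, k = n − 1).

```python
def solvable_without_authentication(n, t, k):
    """Theorem 5: solvable iff t/n < 1/3 and t/k < 1/2."""
    return 3 * t < n and 2 * t < k

# Complete network on 4 processors, one fault: both conditions hold.
print(solvable_without_authentication(n=4, t=1, k=3))   # True
# Three processors cannot tolerate even one Byzantine fault.
print(solvable_without_authentication(n=3, t=1, k=2))   # False
# Enough processors, but connectivity 4 is too low for two faults.
print(solvable_without_authentication(n=7, t=2, k=4))   # False
```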

Three  recent  unpublished  results  deserve  brief  mention,  all  of  which  extend  the  asynchronous 
model slightly in order to avoid the assumptions of Theorem 4. Ben-Or [1] allows randomized
algorithms  and  shows  that  crash-resilient  consensus  is  achievable  with  probability  1  when  t/n  < 
1/2,  and  Byzantine-resilient  consensus  is  achievable  with  probability  1  when  t/n  <  1/5.  Rabin 
[31]  uses  randomized  algorithms  with  an  initial  random  “deal”  and  signatures  to  achieve  certain 
agreement  with  an  expected  number  of  rounds  that  is  only  4,  independent  of  n  and  t,  so  long  as 
t/n  <  1/4.  Finally,  Dolev,  Dwork,  and  Stockmeyer  [7]  distinguish  among  the  different  kinds  of 
asynchrony  in  the  model  of  [18]  to  get  tighter  conditions  on  when  consensus  protocols  are  and  are 
not  possible. 

5. Complexity Results

5.1. Upper Bounds

The t-resilient Byzantine generals algorithms of [25, 29] take time t+1 and send a number of
message bits that is exponential in t. The first algorithm to use only a polynomial number of
message bits was found by Dolev and Strong [12] and subsequently improved by Fischer, Fowler,
and Lynch [16]. The still stronger result below is from [8].

Theorem 6. Let t/n < 1/3. There is a t-resilient solution without authentication to
the Byzantine generals problem which uses 2t + 3 rounds of information exchange and
O(nt + t^3 log t) message bits.

It remains an open problem if there is any unauthenticated algorithm which simultaneously
achieves fewer than 2t + 3 rounds and uses only polynomially many message bits.

With  authentication,  and  counting  number  of  messages  instead  of  message  bits,  we  get: 

Theorem  7. 

(a)  There  is  a  t-resilient  authenticated  solution  to  the  Byzantine  generals  problem 
which uses t+1 rounds and sends O(nt) messages;

(b) There is a t-resilient authenticated solution to the Byzantine generals problem
which uses O(t) rounds and sends only O(n + t^2) messages.

Part (a) was shown by Dolev and Strong [15], and part (b) was shown by Dolev and Reischuk
[10].

For practical applications, these bounds are not very encouraging, especially the t+1 bound
on the number of rounds. As we shall see, this bound cannot be improved in the worst case that
t faults actually occur. However, Dolev, Reischuk and Strong [11, 14] have looked at the question
of whether Byzantine generals solutions exist which stop early when fewer faults occur. The
answer depends on whether synchronization upon termination is also required.

For definiteness, we say that a process halts within r rounds if it is non-faulty, and it chooses
its consensus value and enters a stopping state before sending or receiving any round r+1
messages. It halts in round r if it halts within r rounds but does not halt within r−1 rounds. An
agreement protocol terminates when all reliable processes have halted. If it terminates, we say it
reaches immediate agreement if all reliable processes halt in the same round, and it reaches
eventual agreement otherwise. Thus, immediate agreement serves to synchronize the processes as
well as enabling them to agree on a value. Note that all of the protocols discussed previously
achieve immediate agreement since all processes choose their consensus value in the last round.

The  following  theorem  is  from  [11]: 

Theorem  8.  Let  t/n  <  1/3.  There  is  a  t-resilient  protocol  without  authentication 
which  solves  the  Byzantine  generals  problem  and  reaches  eventual  agreement  within 
min(2t + 3, 2f + 5) rounds, where f ≤ t is the actual number of faults.

The  same  paper  also  contains  a  more  refined  protocol  which  stops  even  earlier  when  t  is  only 
about  \/d7 
If one assumes processes can fail only by crashing, then Lamport and Fischer show that these
bounds can be improved [23].

Theorem 9. There is a t-crash resilient protocol (without authentication) which solves
the generals problem and reaches eventual agreement by the end of round f+2, where
f ≤ t is the actual number of crashes.

We give the protocol and sketch its proof. There are only four possible messages: 0, 1,
NIL, and φ. 0 and 1 are the two possible initial values of the general, φ means “I don't know”, and
NIL is a default consensus value which is chosen when crashes prevent the reliable processes from
discovering the general’s value.

Algorithm  II 

A.  Round  1:  Process  1  (the  general)  sends  its  value  to  every  process. 

B. Round r, 1 < r ≤ t+1: Each process does the following:

1. If it received a value v ∈ {0, 1, NIL} from any process in round r−1, then it:

•  takes  v  as  its  consensus  value; 

•  sends  v  to  every  process; 

•  halts. 

2. Otherwise, if it received φ during round r−1 from every process not known to have
crashed  before  the  beginning  of  that  round,  then  it: 

•  takes  NIL  as  its  consensus  value; 

•  sends  NIL  to  every  process; 

•  halts. 

(It  knows  a  process  has  crashed  if  it  failed  to  receive  an  expected  message  from  it 
during  the  previous  round.) 

3. Otherwise, it sends φ to every process.

C.  End  of  Round  t+1:  Each  process  that  has  not  halted  does  the  following: 

1. If it received a value v ∈ {0, 1, NIL} from any process during round t+1, then it
takes  v  as  its  consensus  value  and  halts. 

2.  Otherwise,  it  chooses  NIL  as  its  consensus  value  and  halts. 

Correctness  of  the  algorithm  follows  readily  from  the  following  facts.  Recall  that  a  crashed 
process  is  not  considered  to  be  halted. 

1. If some process halts at step B1 or B2 during round r and chooses value v, then
every other process which halts at step B1 or B2 during round r also chooses v.

2. If some process halts at step B1 or B2 during round r and chooses value v, then
every reliable process which has not already halted will choose v and halt at step B1
in round r+1 (if r < t+1) or at step C1 in round t+1 (if r = t+1).

3. If no process crashes or halts during round r > 1, then φ is the only message sent
during that round.

4. If any process terminates at step C2 in round t+1, then all reliable processes do.

Moreover, if fewer than k processes crash in the first k rounds, then the protocol terminates
within k+1 rounds; hence if there are at most f crashes, then the protocol terminates within f+2
rounds.
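
The behavior of Algorithm II under different crash patterns can be explored with a small simulation. The sketch rests on modeling choices that are assumptions of this example rather than part of the protocol text: processes are numbered from 0 with process 0 as the general, every message is also delivered to its sender, and a crash is described by the round in which it happens together with the set of recipients the crashing process still reaches in that round. The names `algorithm_ii` and `crashes` are hypothetical.

```python
PHI, NIL = "phi", "NIL"   # "I don't know" and the default consensus value

def algorithm_ii(n, t, general_value, crashes):
    """crashes maps a process to (round, recipients): in that round its message
    reaches only `recipients`, and afterwards the process is silent.
    Returns {reliable process: (consensus value, round in which it halted)}."""
    everyone = set(range(n))
    halted = {}                                   # process -> (value, round)
    crashed = set()
    known_crashed = {p: set() for p in everyone}
    inbox = {p: {} for p in everyone}             # sender -> message of the previous round
    for r in range(1, t + 2):
        sends = {}                                # sender -> (message, recipients)
        for p in everyone - crashed - set(halted):
            if r == 1:                            # step A: only the general sends
                message, halts = (general_value if p == 0 else None), False
            else:
                values = [m for m in inbox[p].values() if m != PHI]
                alive = everyone - known_crashed[p]
                heard_only_phi = all(inbox[p].get(q) == PHI for q in alive)
                # A sender that was expected but silent is known crashed from now on.
                expected = {0} if r == 2 else alive
                known_crashed[p] |= expected - set(inbox[p])
                if values:                        # step B1
                    message, halts = values[0], True
                elif heard_only_phi:              # step B2
                    message, halts = NIL, True
                else:                             # step B3
                    message, halts = PHI, False
            recipients = everyone
            if p in crashes and crashes[p][0] == r:
                recipients = set(crashes[p][1])   # crash in the middle of sending
                crashed.add(p)
            elif halts:
                halted[p] = (message, r)
            if message is not None:
                sends[p] = (message, recipients)
        inbox = {p: {} for p in everyone}
        for sender, (message, recipients) in sends.items():
            for q in recipients:
                inbox[q][sender] = message
    for p in everyone - crashed - set(halted):    # step C, end of round t+1
        values = [m for m in inbox[p].values() if m != PHI]
        halted[p] = (values[0] if values else NIL, t + 1)
    return {p: outcome for p, outcome in halted.items() if p not in crashes}

print(algorithm_ii(4, 2, 1, {}))                          # no crash
print(algorithm_ii(4, 2, 1, {0: (1, [])}))                # general crashes silently
print(algorithm_ii(4, 2, 1, {0: (1, [1]), 1: (2, [2])}))  # value survives two crashes
```

With no crash all processes halt in round 2 with the general's value. When the general crashes before reaching anyone, the survivors halt in round 3 with NIL, which matches the f+2 bound for f = 1. In the third run the value reaches process 2 just before process 1 crashes, and the two survivors both choose 1 by the end of round 3.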

A  more  elaborate  protocol  with  similar  abstract  properties  but  which  is  quite  possibly  more 
efficient  in  practice  appears  in  [34]. 

5.2.  Lower  Bounds 

All  of  the  protocols  above  use  t+1  rounds  in  the  worst  case.  Fischer  and  Lynch  [17]  present  a 
proof  that  t+1  rounds  are  necessary  for  achieving  interactive  consistency  without  signatures  and 
hence  also  for  solving  the  unauthenticated  Byzantine  generals  problem.  Several  people  have 
extended  this  result  in  one  way  or  another.  DeMillo,  Lynch,  and  Merritt  [3,  27]  and 
independently Dolev and Strong [12, 15] show that the t+1 lower bound holds for authenticated
solutions  to  the  Byzantine  generals  problem.  Lamport  and  Fischer  [23],  by  a  similar  proof,  show 
that  the  same  bound  holds  assuming  that  the  protocol  is  only  crash  resilient  and  solves  the  weak 
consensus  problem,  but  they  did  not  consider  the  authenticated  case.  We  summarize  these 
results  below. 

Theorem 10. Assume t ≤ n−2.

(a)  Every  t-resilient  protocol  without  signatures  for  the  weak  consensus  problem 
uses  at  least  t+1  rounds  of  message  exchange  in  the  worst  case. 

(b)  Every  t-resilient  authenticated  protocol  for  the  Byzantine  generals  problem  uses 
at  least  t+1  rounds  of  message  exchange  in  the  worst  case. 

We  note  that  the  weak  consensus  problem  has  not  been  explicitly  studied  with  signed 
messages,  but  we  conjecture  that  the  same  bound  will  still  hold. 

We  sketch  the  basic  structure  underlying  these  proofs,  although  much  more  is  involved  in 
really  making  them  go  through.  For  two  distinct  computations  S  and  T,  define  S  ~  T  if  S  and 
T  “look”  the  same  to  some  reliable  process  p,  that  is,  p  receives  the  same  messages  and  behaves 
exactly  the  same  in  both  S  and  T.  Hence,  p  chooses  the  same  consensus  value  in  each,  which 
must  be  the  consensus  value  for  both  S  and  T.  Now,  the  proof  proceeds  by  assuming  at  most  t 
rounds and then constructing a sequence of t-round computations S_0, S_1, ..., S_k such that S_0 has
consensus value 0, S_k has consensus value 1, and S_{i-1} ~ S_i for 1 ≤ i ≤ k. This results in a
contradiction. The constructions need one faulty process per round; hence, they cannot be used
to find computations of more than t rounds.
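
The contradiction can be written out as a chain of equalities. Writing val(S) for the consensus value of a computation S (a notation introduced here only for illustration), each link of the chain uses the fact that two computations related by ~ share a reliable process that decides identically in both:

```latex
\[
S_{i-1} \sim S_i \;\Longrightarrow\; \mathrm{val}(S_{i-1}) = \mathrm{val}(S_i)
\quad (1 \le i \le k),
\qquad\text{hence}\qquad
0 = \mathrm{val}(S_0) = \mathrm{val}(S_1) = \cdots = \mathrm{val}(S_k) = 1 .
\]
```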

Dolev and Strong [14] show that t+1 rounds are needed in a t-resilient immediate Byzantine
generals protocol even when the actual number of failures is less. These theorems also appear
without proofs in [11].

Theorem 11. Let t ≤ n−2, and let P be a t-resilient (authenticated) protocol solving
the  Byzantine  generals  problem  which  always  reaches  immediate  agreement.  Then  it  is 
possible  for  P  to  run  for  at  least  t+1  rounds  even  when  there  are  no  faults. 

In  the  case  of  eventual  agreement,  they  prove  the  following: 

Theorem  12.  Let  P  be  a  t-resilient  (authenticated)  protocol  solving  the  Byzantine 
generals  problem  which  reaches  eventual  agreement,  and  let  f  <  t.  Then  it  is  possible 
for  P  to  run  for  at  least  f+2  rounds  with  only  f  faults. 

We conjecture that this can be extended to t-crash resilient generals protocols, which would then
show the optimality of Theorem 9.

Finally,  we  look  at  lower  bounds  on  the  number  of  messages  and  signatures  needed.  Dolev 
and  Reischuk  [10]  show: 

Theorem  13.  The  total  number  of  messages  and  signatures  in  any  t-resilient 
(authenticated) Byzantine generals solution is Ω(nt).

Theorem 6 shows that this bound is tight when n is large relative to t. If one counts only
messages, then they show:

Theorem 14. The total number of messages in any t-resilient (authenticated)
Byzantine generals solution is Ω(n + t^2).

Theorem 7, part (b) shows this bound is “best possible” for authenticated algorithms.

6. Applications of Agreement Protocols

The  abstract  versions  of  agreement  problems  considered  in  this  survey  are  not  general  enough 
to  be  directly  applicable  to  many  practical  situations.  We  mention  here  some  extensions  and 
applications  of  these  problems. 

First  of  all,  one  often  wants  to  reach  agreement  on  a  value  from  a  larger  domain  than  just 
{0, 1}. If the domain has v elements, then one can encode the elements in binary and run
⌈log_2 v⌉ copies of the agreement protocol, one for each bit, but more efficient algorithms might be
possible. In applications such as clock synchronization, the domain of values can be taken to be
the real numbers, and only approximate agreement is needed. Lamport and Melliar-Smith [24]
study the clock synchronization problem, and Dolev, Lynch, and Pinter [9] look at the abstract
approximate agreement problem.
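
The bit-by-bit encoding for a larger domain mentioned above can be sketched as follows. Domain elements are assumed to be numbered 0 to v−1, and the single-bit agreement protocol is again replaced by a placeholder, `majority_bit`, which is not fault tolerant. All names are hypothetical.

```python
def majority_bit(bits):
    """Placeholder for a single-bit agreement protocol (assumption of this sketch)."""
    return 1 if 2 * sum(bits) > len(bits) else 0

def agree_on_value(proposals, domain_size, binary_agreement=majority_bit):
    """Run one binary agreement per bit position of the encoded proposals."""
    width = (domain_size - 1).bit_length()     # equals ceil(log2(domain_size))
    agreed_bits = [binary_agreement([(x >> b) & 1 for x in proposals])
                   for b in range(width)]
    return sum(bit << b for b, bit in enumerate(agreed_bits))

print(agree_on_value([5, 5, 5, 2], domain_size=8))   # 5
print(agree_on_value([1, 2, 4], domain_size=8))      # 0
```

The second call shows a limitation of the naive encoding: the agreed value 0 was proposed by nobody, since each bit is decided independently. When the domain size is not a power of two, the assembled bits may even fall outside the domain, so an application would need to map such results to a default.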

A  difficult  part  of  implementing  these  algorithms  is  building  message  systems  which  actually 
have  the  reliability  and  synchronization  properties  that  were  assumed  in  the  models.  Real 
distributed  systems  are  quasi-asynchronous,  and  to  avoid  the  difficulties  of  Theorem  4  one  must 
make  reasonable  timing  assumptions  and  make  effective  use  of  clocks  and  timeouts.  Lamport 
[21]  gives  some  insights  as  to  how  this  can  be  done. 

Finally,  we  should  mention  the  papers  by  Dolev  and  Strong  [13]  and  Mohan,  Strong,  and 
Finkelstein  [28]  that  describe  serious  attempts  to  apply  agreement  protocols  to  real  problems  of 
distributed  databases. 